Bayesian model predictive control: Efficient model exploration and regret bounds using posterior sampling
Tight performance specifications in combination with operational constraints
make model predictive control (MPC) the method of choice in various industries.
As the performance of an MPC controller depends on a sufficiently accurate
objective and prediction model of the process, a significant effort in the MPC
design procedure is dedicated to modeling and identification. Driven by the
increasing amount of available system data and advances in the field of machine
learning, data-driven MPC techniques have been developed to facilitate the MPC
controller design. While these methods are able to leverage available data,
they typically do not provide principled mechanisms to automatically trade off
exploitation of available data and exploration to improve and update the
objective and prediction model. To this end, we present a learning-based MPC
formulation using posterior sampling techniques, which provides finite-time
regret bounds on the learning performance while being simple to implement using
off-the-shelf MPC software and algorithms. The performance analysis of the
method is based on posterior sampling theory and its practical efficiency is
illustrated using a numerical example of a highly nonlinear dynamical
car-trailer system.
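The exploration/exploitation trade-off described above can be sketched with a toy Thompson-sampling loop: maintain a Gaussian posterior over the parameters of a scalar linear model, sample one model per step, and let a one-step "MPC" act greedily under the sample. The system and all names below are hypothetical illustrations of posterior sampling, not the paper's algorithm or its car-trailer example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unknown scalar dynamics x+ = a*x + b*u + w.
a_true, b_true, noise_std = 0.9, 0.5, 0.05

# Gaussian posterior over theta = (a, b) via Bayesian linear regression.
prec = np.eye(2)      # posterior precision, initialized from a N(0, I) prior
info = np.zeros(2)    # information vector, accumulates phi * x_next / noise_var

def posterior():
    cov = np.linalg.inv(prec)
    return cov @ info, cov

x = 1.0
for step in range(200):
    # Posterior sampling: draw one plausible model and act greedily under it.
    mean, cov = posterior()
    a_s, b_s = rng.multivariate_normal(mean, cov)
    # One-step "MPC": choose u minimizing (a_s*x + b_s*u)^2 with |u| <= 1.
    u = 0.0 if abs(b_s) < 1e-6 else float(np.clip(-a_s * x / b_s, -1.0, 1.0))
    x_next = a_true * x + b_true * u + noise_std * rng.standard_normal()
    # Bayesian update with the observed transition.
    phi = np.array([x, u])
    prec += np.outer(phi, phi) / noise_std**2
    info += phi * x_next / noise_std**2
    x = x_next

mean, cov = posterior()
print(mean)   # posterior mean moves toward (a_true, b_true)
```

Sampling from the posterior (rather than always using its mean) is what injects the exploration that the regret analysis quantifies.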
Performance and safety of Bayesian model predictive control: Scalable model-based RL with guarantees
Despite the success of reinforcement learning (RL) in various research
fields, relatively few algorithms have been applied to industrial control
applications. This unexplored potential is partly due to the significant tuning
effort, the large number of learning episodes (i.e., experiments) required, and
the limited availability of RL methods that can address high-dimensional,
safety-critical dynamical systems with continuous state and action spaces. By
building on model predictive control (MPC)
concepts, we propose a cautious model-based reinforcement learning algorithm to
mitigate these limitations. While the underlying policy of the approach can be
efficiently implemented in the form of a standard MPC controller,
data-efficient learning is achieved through posterior sampling techniques. We
provide a rigorous performance analysis of the resulting 'Bayesian MPC'
algorithm by establishing Lipschitz continuity of the corresponding future
reward function and bounding the expected number of unsafe learning episodes
using an exact penalty soft-constrained MPC formulation. The efficiency and
scalability of the method are illustrated using a 100-dimensional server
cooling example and a nonlinear 10-dimensional drone example by comparing the
performance against nominal posterior MPC, which is commonly used for
data-driven control of constrained dynamical systems.
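The exact-penalty property behind the soft-constrained formulation admits a small numerical illustration: with an L1 penalty, the soft-constrained minimizer coincides with the hard-constrained one once the penalty weight exceeds the optimal Lagrange multiplier. The scalar problem below is hypothetical and solved by grid search; it sketches the penalty property only, not the paper's MPC or its examples.

```python
import numpy as np

# Hypothetical one-step problem:
#   hard:  min_u (u - u_des)^2   s.t.  u <= u_max
#   soft:  min_u (u - u_des)^2 + rho * max(0, u - u_max)
# An L1 ("exact") penalty recovers the hard-constrained minimizer exactly
# once rho exceeds the optimal multiplier lambda* = 2*(u_des - u_max).
u_des, u_max = 2.0, 1.0
lam_star = 2.0 * (u_des - u_max)   # = 2.0 for this instance

u_grid = np.linspace(-1.0, 4.0, 500001)

def soft_argmin(rho):
    cost = (u_grid - u_des) ** 2 + rho * np.maximum(0.0, u_grid - u_max)
    return float(u_grid[np.argmin(cost)])

u_weak = soft_argmin(0.5 * lam_star)   # penalty too small: constraint violated
u_exact = soft_argmin(2.0 * lam_star)  # penalty large enough: exact recovery

print(u_weak, u_exact)   # approx 1.5 and 1.0
```

With rho below the threshold the soft solution sits at u_des - rho/2 = 1.5, violating the constraint; above the threshold it lands exactly on the boundary u_max, which is why such penalties allow bounding the number of unsafe episodes rather than merely discouraging violations.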
Learning-based Moving Horizon Estimation through Differentiable Convex Optimization Layers
To control a dynamical system it is essential to obtain an accurate estimate
of the current system state based on uncertain sensor measurements and existing
system knowledge. An optimization-based moving horizon estimation (MHE)
approach uses a dynamical model of the system, and further allows for
integration of physical constraints on system states and uncertainties, to
obtain a trajectory of state estimates. In this work, we address the problem of
state estimation in the case of constrained linear systems with parametric
uncertainty. The proposed approach makes use of differentiable convex
optimization layers to formulate an MHE state estimator for systems with
uncertain parameters. This formulation allows us to obtain the gradient of a
squared and regularized output error, based on sensor measurements and state
estimates, with respect to the current belief of the unknown system parameters.
The parameters within the MHE problem can then be updated online using
stochastic gradient descent (SGD) to improve the performance of the MHE. In a
numerical example of estimating temperatures of a group of manufacturing
machines, we show the performance of tuning the unknown system parameters and
the benefits of integrating physical state constraints in the MHE formulation.
Comment: This paper was accepted for presentation at the 4th Annual Conference
on Learning for Dynamics and Control. The extended version here contains an
additional appendix with more details on the numerical example.
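The parameter-tuning loop described above can be sketched on a scalar toy system: the MHE is a small least-squares problem solved in closed form, and the squared output error is differentiated with respect to the unknown parameter by finite differences, standing in for the differentiable convex optimization layers used in the paper. Everything below is a hypothetical sketch; with noise-free data the output error vanishes exactly at the true parameter.

```python
import numpy as np

# Toy scalar system x+ = a*x with unknown a (true value 0.8); measurements
# are noise-free here, so the MHE output error is zero at the true parameter.
a_true, N = 0.8, 10
y = a_true ** np.arange(N)          # y_k = x_k = a_true^k with x_0 = 1

def mhe_estimates(a_hat, mu=10.0):
    # MHE as least squares over the state trajectory:
    #   min_x ||x - y||^2 + mu * sum_k (x_{k+1} - a_hat * x_k)^2
    D = np.zeros((N - 1, N))
    for k in range(N - 1):
        D[k, k], D[k, k + 1] = -a_hat, 1.0
    return np.linalg.solve(np.eye(N) + mu * D.T @ D, y)

def output_error(a_hat):
    # Squared output error between measurements and MHE state estimates:
    # the quantity differentiated w.r.t. the parameter belief in the paper.
    return float(np.sum((y - mhe_estimates(a_hat)) ** 2))

# Online update by finite-difference gradient descent, a crude stand-in
# for the SGD-through-the-layer update described in the abstract.
a_hat, lr, h = 0.5, 0.01, 1e-5
for _ in range(200):
    grad = (output_error(a_hat + h) - output_error(a_hat - h)) / (2 * h)
    a_hat -= lr * grad

print(output_error(0.5), output_error(a_true), a_hat)
```

The key structural point survives even in this toy: the inner estimation problem stays convex in the states, while the outer loop adjusts the model parameter to shrink the output residual.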
A predictive safety filter for learning-based racing control
The growing need for high-performance controllers in safety-critical
applications like autonomous driving has been motivating the development of
formal safety verification techniques. In this paper, we design and implement a
predictive safety filter that maintains vehicle safety with respect
to track boundaries when combined with any potentially unsafe control
signal, such as those found in learning-based methods. A model predictive
control (MPC) framework is used to create a minimally invasive algorithm that
certifies whether a desired control input is safe and can be applied to the
vehicle, or that provides an alternate input to keep the vehicle in bounds. To
this end, we provide a principled procedure to compute a safe and invariant set
for nonlinear dynamic bicycle models using efficient convex approximation
techniques. To fully support an aggressive racing performance without
conservative safety interventions, the safe set is extended in real-time
through predictive control backup trajectories. Applications for assisted
manual driving and deep imitation learning on a miniature remote-controlled
vehicle demonstrate the safety filter's ability to ensure vehicle safety during
aggressive maneuvers.
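The certify-or-override logic of such a filter can be sketched on a hypothetical one-dimensional "vehicle" with a position bound in place of track boundaries: a braking backup controller defines an invariant safe set, a proposed input is passed through when its successor state stays in that set, and otherwise the closest admissible input is found by bisection. This is a minimal sketch, not the paper's convex-approximation procedure for bicycle models.

```python
import numpy as np

# Toy 1-D vehicle: p+ = p + dt*v, v+ = v + dt*u, with |u| <= u_max and a
# "track boundary" p <= p_max (hypothetical stand-in for track constraints).
dt, u_max, p_max = 0.1, 1.0, 10.0

def stop_dist(v):
    # Upper bound on the discrete-time stopping distance under full braking.
    v = max(v, 0.0)
    return v * v / (2.0 * u_max) + dt * v

def safe_state(p, v):
    # Safe (invariant) set: from (p, v), full braking keeps p <= p_max.
    return p + stop_dist(v) <= p_max

def safety_filter(p, v, u_proposed):
    """Certify u_proposed, or return the closest safe input by bisection."""
    u = float(np.clip(u_proposed, -u_max, u_max))
    if safe_state(p + dt * v, v + dt * u):
        return u                      # certified: passed through unchanged
    lo = -u_max                       # full braking is safe from any safe state
    for _ in range(40):               # safety is monotone in u here
        mid = 0.5 * (lo + u)
        if safe_state(p + dt * v, v + dt * mid):
            lo = mid
        else:
            u = mid
    return lo

# An "aggressive" driver always requests full throttle; the filter only
# intervenes near the boundary, keeping interventions minimally invasive.
p, v, trace = 0.0, 0.0, []
for _ in range(300):
    u = safety_filter(p, v, 1.0)
    p, v = p + dt * v, v + dt * u
    trace.append(p)

print(max(trace))   # stays below p_max
```

The backup (braking) trajectory plays the same structural role as the predictive backup trajectories in the paper: it certifies that a state can still be steered back into bounds, which is what lets the filter stay out of the way during aggressive maneuvers.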
Approximate Predictive Control Barrier Functions using Neural Networks: A Computationally Cheap and Permissive Safety Filter
A predictive control barrier function (PCBF) based safety filter allows for
verifying arbitrary control inputs with respect to future constraint
satisfaction. The approach relies on the solution of two optimization problems
computing the minimal constraint relaxations given the current state, and then
computing the minimal deviation from a proposed input such that the relaxed
constraints are satisfied. This paper presents an approximation procedure that
uses a neural network to approximate the optimal value function of the first
optimization problem from samples, such that the computation becomes
independent of the prediction horizon. It is shown that this approximation
guarantees that states converge to a neighborhood of the implicitly defined
safe set of the original problem, where system constraints can be satisfied for
all times forward. The convergence result relies on a novel class-K
lower bound on the PCBF decrease and depends on the approximation error of the
neural network. Lastly, we demonstrate our approach in simulation for an
autonomous driving example and show that the proposed approximation leads to a
significant decrease in computation time compared to the original approach.
Comment: Submitted to ECC2
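The approximation idea can be sketched in one dimension: the minimal-relaxation value function has a simple closed form for a toy integrator, and a cheap regressor fitted to samples of it replaces the horizon-dependent optimization at query time. A polynomial fit stands in for the paper's neural network; everything below is a hypothetical sketch, not the paper's method or its driving example.

```python
import numpy as np

# Toy 1-D setting: dynamics x+ = x + u with |u| <= 0.3 and constraint |x| <= 1.
# The PCBF-style value h(x) is the minimal accumulated constraint relaxation
# over the horizon; for this toy it is obtained by contracting toward the
# constraint set at maximal rate and summing the remaining violations.
U_MAX, HORIZON = 0.3, 20

def pcbf_value(x):
    d, total = abs(x), 0.0
    for _ in range(HORIZON):
        total += max(d - 1.0, 0.0)   # slack needed at this step
        d = max(d - U_MAX, 0.0)      # contract toward the constraint set
    return total

# Offline: sample h and fit a cheap function approximator. Evaluating the fit
# is O(1) in the horizon length, which is the point of the approximation.
xs = np.linspace(-2.5, 2.5, 401)
hs = np.array([pcbf_value(x) for x in xs])
coef = np.polyfit(xs, hs, 8)

def h_approx(x):
    # Fast surrogate used in place of the first optimization problem.
    return float(np.polyval(coef, x))

mse = float(np.mean((np.polyval(coef, xs) - hs) ** 2))
print(mse, h_approx(0.0), h_approx(2.0))
```

As in the paper, the approximation error of the surrogate is exactly what limits the guarantee: the filter built on h_approx can only certify convergence to a neighborhood of the true safe set, with the neighborhood's size governed by the fit error.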